Contact Surface Area: A Novel Signal for Heart Rate Estimation in Smartphone Videos
We consider the problem of smartphone video-based heart rate estimation,
which typically relies on measuring the green color intensity of the user's
skin. We describe a novel signal in fingertip videos used for smartphone-based
heart rate estimation: fingertip contact surface area. We propose a model
relating contact surface area to pressure, and validate it on a dataset of 786
videos from 62 participants by demonstrating a statistical correlation between
contact surface area and green color intensity. We estimate heart rate on our
dataset with two algorithms, a baseline using the green signal only and a novel
algorithm based on both color and area. We demonstrate lower rates of
substantial errors (>10 beats per minute) using the novel algorithm (4.1%),
compared both to the baseline algorithm (6.4%) and to published results using
commercial color-based applications (>6%).
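As a rough illustration of the color-only baseline idea, the sketch below estimates heart rate from a per-frame green-intensity trace by picking the dominant spectral peak, and computes the share of estimates that miss a reference by more than 10 beats per minute. It is not the authors' algorithm: the function names `estimate_bpm` and `substantial_error_rate`, the 40 to 200 bpm search band, and the synthetic trace are all assumptions made for this example.

```python
import numpy as np


def estimate_bpm(signal, fps, lo_bpm=40.0, hi_bpm=200.0):
    """Return the dominant frequency of `signal` in beats per minute.

    Hypothetical helper: picks the largest FFT magnitude inside an
    assumed plausible heart-rate band (lo_bpm..hi_bpm).
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC component before the FFT
    spectrum = np.abs(np.fft.rfft(x))
    freqs_bpm = np.fft.rfftfreq(x.size, d=1.0 / fps) * 60.0
    band = (freqs_bpm >= lo_bpm) & (freqs_bpm <= hi_bpm)
    return float(freqs_bpm[band][np.argmax(spectrum[band])])


def substantial_error_rate(estimates, references, threshold_bpm=10.0):
    """Fraction of estimates that differ from the reference by more than threshold_bpm."""
    err = np.abs(np.asarray(estimates, dtype=float) - np.asarray(references, dtype=float))
    return float(np.mean(err > threshold_bpm))


# Synthetic 20-second trace at 30 frames per second with a 1.2 Hz (72 bpm) pulse.
fps = 30.0
t = np.arange(600) / fps
rng = np.random.default_rng(0)
green = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)

bpm = estimate_bpm(green, fps)
rate = substantial_error_rate([70, 85, 60], [72, 70, 61])
```

A combined color-and-area method would need a second trace (contact surface area per frame) and a rule for fusing the two; the abstract does not specify that rule, so it is left out here.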
Generative Marginalization Models
We introduce marginalization models (MaMs), a new family of generative models
for high-dimensional discrete data. They offer scalable and flexible generative
modeling with tractable likelihoods by explicitly modeling all induced marginal
distributions. Marginalization models enable fast evaluation of arbitrary
marginal probabilities with a single forward pass of the neural network, which
overcomes a major limitation of methods with exact marginal inference, such as
autoregressive models (ARMs). We propose scalable methods for learning the
marginals, grounded in the concept of "marginalization self-consistency".
Unlike previous methods, MaMs support scalable training of any-order generative
models for high-dimensional problems under the setting of energy-based
training, where the goal is to match the learned distribution to a given
desired probability (specified by an unnormalized (log) probability function
such as an energy function or reward function). We demonstrate the effectiveness
of the proposed model on a variety of discrete data distributions, including
binary images, language, physical systems, and molecules, for maximum
likelihood and energy-based training settings. MaMs achieve orders-of-magnitude
speedups in evaluating marginal probabilities in both settings. For
energy-based training tasks, MaMs enable any-order generative modeling of
high-dimensional problems beyond the capability of previous methods. Code is at
https://github.com/PrincetonLIPS/MaM
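To make the phrase "marginalization self-consistency" concrete, the sketch below checks the ordinary sum rule on a tiny explicit joint distribution: the marginal of a partial assignment should equal the sum, over the values of one more variable, of the marginals that extend it. This is only the property being enforced, shown on a lookup table; it is not the MaM training procedure or the API of the linked repository, and the names `joint`, `marginal`, and `max_consistency_gap` are hypothetical.

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
D = 3  # number of binary variables (assumed small so the table is explicit)
joint = rng.random((2,) * D)
joint /= joint.sum()  # normalize into a valid joint distribution


def marginal(assignment):
    """Marginal probability of a partial assignment {variable index: value}.

    Unassigned variables are summed out of the joint table.
    """
    index = tuple(assignment.get(i, slice(None)) for i in range(D))
    return float(joint[index].sum())


def max_consistency_gap():
    """Largest violation of p(x_S) = sum over v of p(x_S, x_i = v) across all S and i."""
    gap = 0.0
    for r in range(D):
        for subset in itertools.combinations(range(D), r):
            for values in itertools.product((0, 1), repeat=r):
                base = dict(zip(subset, values))
                for i in set(range(D)) - set(subset):
                    summed = sum(marginal({**base, i: v}) for v in (0, 1))
                    gap = max(gap, abs(marginal(base) - summed))
    return gap


gap = max_consistency_gap()
```

An exact table satisfies the constraint up to floating-point error. A neural network that outputs marginals directly has no such guarantee, which is presumably why the abstract describes learning methods built around this constraint.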
Function-based Intersubject Alignment of Human Cortical Anatomy
Making conclusions about the functional neuroanatomical organization of the human brain requires methods for relating the functional anatomy of an individual's brain to population variability. We have developed a method for aligning the functional neuroanatomy of individual brains based on the patterns of neural activity that are elicited by viewing a movie. Instead of basing alignment on functionally defined areas, whose location is defined as the center of mass or the local maximum response, the alignment is based on patterns of response as they are distributed spatially both within and across cortical areas. The method is implemented in the two-dimensional manifold of an inflated, spherical cortical surface. The method, although developed using movie data, generalizes successfully to data obtained with another cognitive activation paradigm (viewing static images of objects and faces) and improves group statistics in that experiment as measured by a standard general linear model (GLM) analysis.
Enabling Factor Analysis on Thousand-Subject Neuroimaging Datasets
The scale of functional magnetic resonance image data is rapidly increasing
as large multi-subject datasets are becoming widely available and
high-resolution scanners are adopted. The inherent low-dimensionality of the
information in this data has led neuroscientists to consider factor analysis
methods to extract and analyze the underlying brain activity. In this work, we
consider two recent multi-subject factor analysis methods: the Shared Response
Model and Hierarchical Topographic Factor Analysis. We perform analytical,
algorithmic, and code optimization to enable multi-node parallel
implementations to scale. Single-node improvements result in 99x and 1812x
speedups on these two methods, and enable the processing of larger datasets.
Our distributed implementations show strong scaling of 3.3x and 5.5x
respectively with 20 nodes on real datasets. We also demonstrate weak scaling
on a synthetic dataset with 1024 subjects, on up to 1024 nodes and 32,768
cores.
Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
The use of positional embeddings in transformer language models is widely
accepted. However, recent research has called into question the necessity of
such embeddings. We further extend this inquiry by demonstrating that a
randomly initialized and frozen transformer language model, devoid of
positional embeddings, inherently encodes strong positional information through
the shrinkage of self-attention variance. To quantify this variance, we derive
the underlying distribution of each step within a transformer layer. Through
empirical validation using a fully pretrained model, we show that the variance
shrinkage effect still persists after extensive gradient updates. Our findings
serve to justify the decision to discard positional embeddings and thus
facilitate more efficient pretraining of transformer language models.
Comment: Accepted by ACL 202
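One simplified way to picture variance shrinkage is sketched below. It assumes that causal attention weights are exactly uniform and that value vectors are independent standard normal draws, which is an idealization chosen for this example and not the derivation in the paper. Under those assumptions the output at position t is the mean of t independent vectors, so its variance falls as 1/t and therefore carries information about position.

```python
import numpy as np

rng = np.random.default_rng(0)
trials, seq_len, dim = 200, 64, 256  # illustrative sizes, chosen arbitrarily

# Assumed i.i.d. standard normal value vectors: shape (trials, seq_len, dim).
values = rng.standard_normal((trials, seq_len, dim))

# Uniform causal attention: the output at position t averages values[0..t].
positions = np.arange(1, seq_len + 1)
outputs = np.cumsum(values, axis=1) / positions[None, :, None]

# Empirical variance per position, pooled over trials and feature dimensions.
empirical_var = outputs.var(axis=(0, 2))
expected_var = 1.0 / positions  # variance of a mean of t unit-variance samples
```

Comparing `empirical_var` with `expected_var` shows the decay with position. A trained model's attention is far from uniform, so this toy setup says nothing by itself about whether the effect persists after training, which the abstract reports as an empirical finding.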